In this notebook, we explore the MAIAC variable “injection height” to see if it may be useful in our analysis.

aod <- maiac_loadRaster("h01v04", 20171009, product = "MAIACTAOT", converterPath = "../executables/h4toncff_nc4") 
height <- maiac_loadRaster("h01v04", 20171009, product = "MAIACTAOT", converterPath = "../executables/h4toncff_nc4",
                           param = "Injection_Height") 
hist(values(height), 100)

sum(is.na(values(height)))
## [1] 1434255
sum(!is.na(values(height)))
## [1] 5745
hist(values(aod), 100, xlim = c(0, 2.5))

hist(values(aod)[!is.na(values(height))], 100,  xlim = c(0,2.5))

plot(height)

plot(aod)

plot(values(aod), values(height))

aod_geo <- projectRaster(aod, crs = CRS("+proj=longlat +ellps=GRS80 +datum=NAD83 +no_defs +towgs84=0,0,0"))%>% 
  crop(raster::extent(-125.52, -117.03, 36.52, 41.78)) 
height_geo <- projectRaster(height, crs = CRS("+proj=longlat +ellps=GRS80 +datum=NAD83 +no_defs +towgs84=0,0,0"))%>% 
  crop(raster::extent(-125.52, -117.03, 36.52, 41.78)) 

plot(aod_geo)
plot(USCensusStates, add = TRUE)
monitorMap(monitorSlice, add = TRUE)

plot(height_geo)
plot(USCensusStates, add = TRUE)
monitorMap(monitorSlice, add = TRUE)

Injection_Height is almost completely comprised of “na” values, so may be unhelpful. Let’s take a look at some of the other days/ times to see if that is consistent.

height2 <- maiac_loadRaster("h01v04", 20171009, 2150, product = "MAIACAAOT", converterPath = "../executables/h4toncff_nc4",
                           param = "Injection_Height") 
## /Users/helen/Data/maiac/MAIACAAOT.h01v04.20172822150.nc already exists. Skipping conversion
plot(height2)

height3 <- maiac_loadRaster("h01v04", 20171010, 1915, product = "MAIACTAOT", converterPath = "../executables/h4toncff_nc4",
                           param = "Injection_Height")
## /Users/helen/Data/maiac/MAIACTAOT.h01v04.20172831915.nc already exists. Skipping conversion
aod3 <- maiac_loadRaster("h01v04", 20171010, 1915, product = "MAIACTAOT", converterPath = "../executables/h4toncff_nc4")
## /Users/helen/Data/maiac/MAIACTAOT.h01v04.20172831915.nc already exists. Skipping conversion
plot(height3)

plot(aod3)

height4 <- maiac_loadRaster("h01v04", 20171010, product = "MAIACAAOT", converterPath = "../executables/h4toncff_nc4",
                           param = "Injection_Height")
## /Users/helen/Data/maiac/MAIACAAOT.h01v04.20172832055.nc already exists. Skipping conversion
aod4 <- maiac_loadRaster("h01v04", 20171010, product = "MAIACAAOT", converterPath = "../executables/h4toncff_nc4")
## /Users/helen/Data/maiac/MAIACAAOT.h01v04.20172832055.nc already exists. Skipping conversion
plot(height4)

plot(aod4)

height5 <- maiac_loadRaster("h01v04", 20171013, product = "MAIACAAOT", converterPath = "../executables/h4toncff_nc4",
                           param = "Injection_Height")
## /Users/helen/Data/maiac/MAIACAAOT.h01v04.20172862125.nc already exists. Skipping conversion
aod5 <- maiac_loadRaster("h01v04", 20171013, product = "MAIACAAOT", converterPath = "../executables/h4toncff_nc4")
## /Users/helen/Data/maiac/MAIACAAOT.h01v04.20172862125.nc already exists. Skipping conversion
plot(height5)

plot(aod5)

It is always missing a lot. Let’s see if it improves our model at all.

mod1 <- lm(monitor~satellite, data = dfall)
mod2 <- lm(monitor~satellite + height, data = dfall)
mod3 <- lm(monitor~satellite + height + height:satellite, data = dfall)
mod4 <- lm(monitor~satellite + height:satellite, data = dfall)

# Model 1: just AOD as a predictor
summary(mod1)
## 
## Call:
## lm(formula = monitor ~ satellite, data = dfall)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -136.671   -7.753   -2.833    3.315  166.495 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.0470     0.9162   3.326 0.000917 ***
## satellite   118.4089     3.3405  35.447  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.65 on 913 degrees of freedom
##   (357 observations deleted due to missingness)
## Multiple R-squared:  0.5792, Adjusted R-squared:  0.5787 
## F-statistic:  1256 on 1 and 913 DF,  p-value: < 2.2e-16
# Model 2: AOD and injection height as predictors
summary(mod2)
## 
## Call:
## lm(formula = monitor ~ satellite + height, data = dfall)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -157.506   -7.650   -2.561    3.455  158.747 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   1.44014    0.94144   1.530    0.126    
## satellite   135.05742    4.35242  31.030  < 2e-16 ***
## height       -0.04251    0.00730  -5.823 7.98e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.25 on 912 degrees of freedom
##   (357 observations deleted due to missingness)
## Multiple R-squared:  0.5942, Adjusted R-squared:  0.5934 
## F-statistic: 667.8 on 2 and 912 DF,  p-value: < 2.2e-16
# Model 3: AOD and injection height and interaction between AOD and injection height
summary(mod3)
## 
## Call:
## lm(formula = monitor ~ satellite + height + height:satellite, 
##     data = dfall)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -157.956   -7.575   -2.429    3.608  158.564 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.22128    0.94689   1.290   0.1975    
## satellite        136.10435    4.37985  31.075   <2e-16 ***
## height            -0.01283    0.01706  -0.752   0.4523    
## satellite:height  -0.02817    0.01463  -1.925   0.0545 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.22 on 911 degrees of freedom
##   (357 observations deleted due to missingness)
## Multiple R-squared:  0.5959, Adjusted R-squared:  0.5946 
## F-statistic: 447.8 on 3 and 911 DF,  p-value: < 2.2e-16
# Model 4: AOD and interaction between AOD and injection height
summary(mod4)
## 
## Call:
## lm(formula = monitor ~ satellite + height:satellite, data = dfall)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -156.359   -7.557   -2.397    3.609  158.763 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.232560   0.946547   1.302    0.193    
## satellite        135.556497   4.317807  31.395  < 2e-16 ***
## satellite:height  -0.038123   0.006253  -6.097 1.59e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.22 on 912 degrees of freedom
##   (357 observations deleted due to missingness)
## Multiple R-squared:  0.5956, Adjusted R-squared:  0.5948 
## F-statistic: 671.7 on 2 and 912 DF,  p-value: < 2.2e-16
# Does adding a term for height improve the model?
anova(mod1, mod2)
## Analysis of Variance Table
## 
## Model 1: monitor ~ satellite
## Model 2: monitor ~ satellite + height
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    913 468424                                  
## 2    912 451630  1     16794 33.913 7.982e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Does adding an interaction term for height improve the model?
anova(mod2, mod3)
## Analysis of Variance Table
## 
## Model 1: monitor ~ satellite + height
## Model 2: monitor ~ satellite + height + height:satellite
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
## 1    912 451630                              
## 2    911 449800  1    1830.6 3.7075 0.05448 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Does taking out the height term (but not the interaction term) make the model worse?
anova(mod3, mod4)
## Analysis of Variance Table
## 
## Model 1: monitor ~ satellite + height + height:satellite
## Model 2: monitor ~ satellite + height:satellite
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1    911 449800                           
## 2    912 450079 -1    -279.2 0.5655 0.4523
# Does adding an interaction term for height (but no term just for height) improve the model? 
anova(mod1, mod4)
## Analysis of Variance Table
## 
## Model 1: monitor ~ satellite
## Model 2: monitor ~ satellite + height:satellite
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    913 468424                                  
## 2    912 450079  1     18345 37.173 1.594e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

It looks like adding injection height as an interaction term with AOD improves the model. However, we only have 72 values out of 1272 for ‘height’. The model looks like this:

\[ PM_{2.5} = 1.2 + 135.55 \cdot AOD - 0.038\cdot AOD\cdot Injection\_height + \epsilon_i \]

This says that, when injection height is 0, for each unit increase in AOD, we can expect PM2.5 to increase by \(135.55 \frac{\mu g}{m^3}\). As injection height increases, this relationship decreases. Thus, for a given AOD value, we would expect the ground PM2.5 value to be lower when injection height is high than when injection height is low or nonexistant.